News regarding mod_perl returning empty pages
on 26.08.2009 04:20:16 by Igor Chudov
I have an update on this story.
I set up a shell script that would notice this condition (empty pages
returned) and alert me immediately via "wall" within 10 seconds.
So I caught this in progress, before my 5 minute restarter would intervene,
and experimented quickly.
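
For readers who want to reproduce this kind of monitoring, here is a minimal Python sketch of such a watchdog. It is not the shell script described above, which was not posted; the function names, the 5-second timeout, and the rule "HTTP 200 with a whitespace-only body counts as an empty page" are all assumptions made for illustration.

```python
import subprocess
import time
import urllib.request


def is_empty_page(status, body):
    """Assumed definition of the fault: HTTP 200 with a blank body."""
    return status == 200 and len(body.strip()) == 0


def fetch(url, timeout=5):
    """Return (status, body bytes) for a single GET request."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read()


def alert(message):
    """Broadcast to logged-in users; wall reads the message from stdin."""
    subprocess.run(["wall"], input=message.encode(), check=False)


def watch(url, interval=10):
    """Poll the URL forever and alert whenever an empty page comes back."""
    while True:
        try:
            status, body = fetch(url)
            if is_empty_page(status, body):
                alert(f"empty page returned by {url}")
        except OSError as exc:
            # URLError and HTTPError are OSError subclasses; a hard failure
            # is a different condition from an empty page, so label it.
            alert(f"request to {url} failed: {exc}")
        time.sleep(interval)
```

Calling `watch("http://www.algebra.com/")` would poll every 10 seconds. Pointing a second instance at the backend port directly makes it possible to compare the proxied and direct paths, as in the experiment that follows.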
As I mentioned before, I have perlbal running in front on port 80, with
Apache listening on localhost port 10080 as perlbal's backend.
Since we discussed perlbal, I tried to access
http://localhost.algebra.com:10080/ instead of www.algebra.com
(localhost.algebra.com resolves to 127.0.0.1 and is an alias for
www.algebra.com).
Results were materially different: instead of the 100% failure rate that I
had through perlbal, only one request out of many failed.
The failure became intermittent instead of constant.
I believe that I have an explanation: only one worker happens to be messed
up. When I access Apache directly, I connect to a random worker, so the
failure is intermittent. However, perlbal was set (in perlbal.conf) to
maintain persistent connections, so it would mostly bang on one worker
instead of hitting them at random.
As a result, with perlbal, if that worker goes bad, then the whole website
is consistently not working.
In other words, perlbal is not the issue, as such, but it exacerbated the
problem due to the way I set it up.
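
The reasoning above can be illustrated with a toy simulation. This is only a model under stated assumptions: 10 workers, exactly one of them broken, direct requests landing on a uniformly random worker, and the persistent connection pinned to the broken worker in the worst case. It does not reproduce how Apache or perlbal actually assign connections.

```python
import random


def failure_rate(num_workers, bad_worker, requests, persistent, seed=0):
    """Fraction of requests served by the single broken worker.

    persistent=True models a proxy whose one kept-alive backend
    connection is stuck on the broken worker (worst case).
    persistent=False models a fresh connection per request that
    lands on a uniformly random worker.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(requests):
        worker = bad_worker if persistent else rng.randrange(num_workers)
        if worker == bad_worker:
            failures += 1
    return failures / requests


if __name__ == "__main__":
    print("persistent backend: ", failure_rate(10, 3, 10_000, persistent=True))
    print("connection per request:", failure_rate(10, 3, 10_000, persistent=False))
```

In this model the pinned case fails on every request, while the per-request case fails roughly one time in ten, which matches the "constant versus intermittent" pattern observed.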
I changed perlbal.conf to set persist_backend = off, which, I hope, will
make perlbal open a separate backend connection for every request.
This should not cost much in speed, since everything runs on localhost.
My new perlbal.conf is included:
CREATE POOL dynamic
  pool dynamic add 127.0.0.1:10080

CREATE SERVICE balancer
  SET listen = 0.0.0.0:80
  SET role = reverse_proxy
  SET pool = dynamic
  SET persist_client = on
  SET persist_backend = off
  SET verify_backend = on
ENABLE balancer

# always good to keep an internal management port open:
CREATE SERVICE mgmt
  SET role = management
  SET listen = 127.0.0.1:60000
ENABLE mgmt
Re: News regarding mod_perl returning empty pages
on 26.08.2009 08:55:28 by Fred Moyer
On Tue, Aug 25, 2009 at 7:20 PM, Igor Chudov wrote:
> My new perlbal.conf is included:
>
> CREATE POOL dynamic
>   pool dynamic add 127.0.0.1:10080
>
> CREATE SERVICE balancer
>   SET listen = 0.0.0.0:80
>   SET role = reverse_proxy
>   SET pool = dynamic
>   SET persist_client = on
>   SET persist_backend = off
>   SET verify_backend = on

With verify_backend = on, perlbal makes an OPTIONS request to the
mod_perl server. That request usually returns an empty 200, which in
other scenarios could be viewed as a blank page.
How perlbal would send that back to the client is something I don't
know. But I would consider disabling that option as it doubles the
number of total requests to the backend mod_perl server. That's just
my opinion though; changing that setting could cause other failure
modes to surface.
If you want to use that option, you should write a mod_perl handler
that answers to a verify_backend url (see the perlbal docs) and
returns a few characters of text back to perlbal. If you see those
characters in your blank screens - that's the problem. Although I'm
skeptical that is what is happening.
Re: News regarding mod_perl returning empty pages
on 26.08.2009 14:22:48 by Igor Chudov
Fred, thanks. I am afraid that verify_backend is more of an expensive
distraction than something actually useful.
For now I will try setting both persist_backend and verify_backend to off.
I am also considering changing MaxRequestsPerChild and setting it to
something like 1,000.
None of this really solves the problem, but it might alleviate it
considerably.
Thank you very much. Your help was invaluable and assisted me in keeping a
level head in dealing with this problem.
Re: News regarding mod_perl returning empty pages
on 26.08.2009 15:58:03 by Perrin Harkins
Igor,
Why don't you try logging the request size from your mod_perl server?
If it turns out that it knows when a request is zero bytes, you can
just kill the process in a cleanup handler.
Also, if you identify the PID of the broken process in this way, you
can look back through the logs to see what it did on the request
before it was broken.
- Perrin
Re: News regarding mod_perl returning empty pages
on 26.08.2009 17:12:33 by Igor Chudov
On Wed, Aug 26, 2009 at 8:58 AM, Perrin Harkins wrote:
> Igor,
>
> Why don't you try logging the request size from your mod_perl server?
> If it turns out that it knows when a request is zero bytes, you can
> just kill the process in a cleanup handler.
>
Do you refer to the response size, as opposed to request size?
If so... I like your idea. How would I do that?
>
> Also, if you identify the PID of the broken process in this way, you
> can look back through the logs to see what it did on the request
> before it was broken.
>
Yes, it is a very good idea indeed.
Re: News regarding mod_perl returning empty pages
on 26.08.2009 17:16:31 by Perrin Harkins
On Wed, Aug 26, 2009 at 11:12 AM, Igor Chudov wrote:
> Do you refer to the response size, as opposed to request size?
Yes.
> If so... I like your idea. How would I do that?
Just read the logging section of the apache docs. It's a common part
of the access log.
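
As a rough illustration of this advice, the Python sketch below scans access-log lines for 200 responses with an empty body and counts them per child PID. It assumes the log was written with `LogFormat "%h %l %u %t \"%r\" %>s %b %P"`, that is, Common Log Format with the child process ID (`%P`) appended; `%b` prints `-` when no body bytes were sent. The function names are hypothetical, and request lines containing escaped quotes are not handled.

```python
import re
from collections import Counter

# Assumed format: LogFormat "%h %l %u %t \"%r\" %>s %b %P"
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) (?P<pid>\d+)$'
)


def empty_responses(lines):
    """Yield (pid, time, request) for each 200 response with no body."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if match.group("status") == "200" and match.group("size") in ("-", "0"):
            yield int(match.group("pid")), match.group("time"), match.group("request")


def suspects(lines):
    """Count empty 200 responses per child PID."""
    return Counter(pid for pid, _, _ in empty_responses(lines))
```

Running `suspects(open(path))` on the access log (with `path` pointing at wherever the log lives) gives a PID-to-count mapping. A PID that dominates the counts is a candidate for the broken worker, and its earlier log lines show what it served before it went bad.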
- Perrin
Re: News regarding mod_perl returning empty pages
on 26.08.2009 19:56:43 by Fred Moyer
On Wed, Aug 26, 2009 at 8:16 AM, Perrin Harkins wrote:
> On Wed, Aug 26, 2009 at 11:12 AM, Igor Chudov wrote:
>> Do you refer to the response size, as opposed to request size?
>
> Yes.
>
>> If so... I like your idea. How would I do that?
>
> Just read the logging section of the apache docs. It's a common part
> of the access log.
This is a good reference:
http://perl.apache.org/docs/2.0/user/handlers/http.html#PerlLogHandler
Re: News regarding mod_perl returning empty pages
on 28.08.2009 18:49:14 by Igor Chudov
I think that I finally have a clue as to why those empty pages were
returned.
I have perlbal as front, and it was set to maintain persistent connections
with the apache backend listening on localhost.
I also have some configuration of apache that would essentially deny access
to certain user agents that were known to abuse my site before, such as
wget, RapidDownloader, ia_archiver, and a few more. (I know that wget can be
run with different user agents. The people who run wget against my site
usually are clueless. If they had enough clue to change the user agent, they
would also realize that my site is dynamic and very deep.)
My conjecture is that perlbal selected some worker for a persistent
connection, and the first user agent to connect was one of those "bad guys".
The worker would then reject all subsequent queries coming in on the same
TCP connection, with the unfortunate effect that all queries were rejected.
Since the time that I disabled persistent connections (which should not
matter too much on localhost), I have never had this problem where my server
would start returning empty pages.
I also verified with ab that disabling persistent connections caused no
performance hit.
The stock Ubuntu Hardy mod_perl is solid as a rock, now.
I want to thank everyone. It was a tough one because regular stress testing
would not trigger it.
As for those spiders, some of them are sort of legitimate, like ia_archiver
(which I thought was a rogue bot at some point as it would not provide a
webpage), but their expense in terms of traffic is not worth the benefit
that I get from them.
Igor